A Survey of Inference Control Methods for Privacy-Preserving Data Mining
Author
Abstract
Inference control in databases, also known as Statistical Disclosure Control (SDC), is about protecting data so that they can be published without revealing confidential information that can be linked to the specific individuals to whom the data correspond. It has important applications in several areas, such as official statistics, health statistics, and e-commerce (sharing of consumer data). Since data protection ultimately means data modification, the challenge for SDC is to achieve protection with minimum loss of the accuracy sought by database users. In this chapter, we survey the current state of the art in SDC methods for protecting individual data (microdata). We discuss several information loss and disclosure risk measures and analyze several ways of combining them to assess the performance of the various methods. Finally, topics that need more research are identified and possible directions are suggested.
Similar resources
Survey on Recent Developments in Privacy Preserving Models
Privacy preservation in data mining [1] is a major area of research in data security that is attracting increasing interest. Privacy is provided for data at different levels, such as while publishing the data and at the time of retrieving results, by preserving sensitive data without disclosing it. It is not sufficient just to preserve sensitive data without disclosing it, but also need to manip...
A General Survey of Privacy-Preserving Data Mining Models and Algorithms
In recent years, privacy-preserving data mining has been studied extensively, because of the wide proliferation of sensitive information on the internet. A number of algorithmic techniques have been designed for privacy-preserving data mining. In this paper, we provide a review of the state-of-the-art methods for privacy. We discuss methods for randomization, k-anonymization, and distributed pr...
Privacy-Preserving Distributed Data Mining Techniques: A Survey
In various distributed data mining settings, leakage of the real data is not acceptable because of privacy issues. To overcome this problem, numerous privacy-preserving distributed data mining practices have been suggested, such as protecting the privacy of the data by perturbing it with a randomization algorithm and using cryptographic techniques. In this paper, we review and provide extensive survey ...
Developments and Directions
This article first describes the privacy concerns that arise due to data mining, especially for national security applications. Then we discuss privacy-preserving data mining. In particular, we view the privacy problem as a form of inference problem and introduce the notion of privacy constraints. We also describe an approach for privacy constraint processing and discuss its relationship to pri...
A Survey of Randomization Methods for Privacy-Preserving Data Mining
A well known method for privacy-preserving data mining is that of randomization. In randomization, we add noise to the data so that the behavior of the individual records is masked. However, the aggregate behavior of the data distribution can be reconstructed by subtracting out the noise from the data. The reconstructed distribution is often sufficient for a variety of data mining tasks such as...
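The randomization idea described in the last entry above (mask individual records with noise, then recover aggregate behavior) can be illustrated with a minimal sketch. It is not taken from any of the listed papers. It assumes independent zero-mean Gaussian noise with a known standard deviation, and it recovers only the mean and variance rather than the full distribution. The function names `randomize` and `reconstruct_moments` are hypothetical.

```python
import random
import statistics


def randomize(values, noise_sd, rng):
    """Mask each record by adding independent zero-mean Gaussian noise.

    Assumption for this sketch: the noise standard deviation is public.
    """
    return [v + rng.gauss(0.0, noise_sd) for v in values]


def reconstruct_moments(masked, noise_sd):
    """Estimate the mean and variance of the original data from masked values.

    Zero-mean noise leaves the expected mean unchanged, and independence gives
    Var(masked) = Var(original) + noise_sd**2, so the known noise variance
    is subtracted from the observed variance (clamped at zero).
    """
    mean = statistics.fmean(masked)
    variance = max(statistics.pvariance(masked) - noise_sd ** 2, 0.0)
    return mean, variance


if __name__ == "__main__":
    rng = random.Random(0)
    # Synthetic "confidential" attribute, generated only for illustration.
    original = [rng.gauss(50.0, 5.0) for _ in range(20000)]
    masked = randomize(original, noise_sd=10.0, rng=rng)

    print("reconstructed:", reconstruct_moments(masked, noise_sd=10.0))
    print("original:     ", (statistics.fmean(original),
                             statistics.pvariance(original)))
```

Each masked record can differ substantially from its original value, while the aggregate estimates stay close to those of the original data when the sample is large. Recovering a full distribution, as opposed to its first two moments, would need a more elaborate estimation procedure than this sketch shows.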